Topological entropy of DNA sequences

Author

  • David Koslicki
Abstract

MOTIVATION: Topological entropy has been one of the most difficult of all the entropy-theoretic notions to implement. This is primarily due to finite sample effects and high-dimensionality problems. In particular, topological entropy has been implemented in previous literature to conclude that the entropy of exons is higher than that of introns, thus implying that exons are more 'random' than introns.

RESULTS: We define a new approximation to topological entropy free from the aforementioned difficulties. We compute its expected value and apply this definition to the intron and exon regions of the human genome to observe that, as expected, the entropy of introns is significantly higher than that of exons. We also find that introns are less random than expected: their entropy is lower than the computed expected value. We also observe the perplexing phenomenon that introns on chromosome Y have atypically low and bimodal entropy, possibly corresponding to random sequences (high entropy) and sequences that possess hidden structure or function (low entropy).

AVAILABILITY: A Mathematica implementation is available at http://www.math.psu.edu/koslicki/entropy.nb

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
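The abstract does not spell out the approximation itself. As a minimal sketch only, the Python below assumes one finite-sequence formulation: for a sequence w over a 4-letter alphabet, pick the largest n with 4^n + n - 1 <= |w|, count the distinct length-n subwords in the first 4^n + n - 1 letters, and return log base 4 of that count divided by n. The function name `topological_entropy` and the `alphabet_size` parameter are hypothetical, and this is not the Mathematica implementation linked above.

```python
import math


def topological_entropy(seq: str, alphabet_size: int = 4) -> float:
    """Assumed finite-sequence approximation to topological entropy.

    Illustrative only: the definition used here is an assumption,
    not taken from the abstract.
    """
    length = len(seq)
    if length < alphabet_size:
        raise ValueError("sequence is too short for subword length n = 1")

    # Largest n such that alphabet_size**n + n - 1 <= length.
    n = 1
    while alphabet_size ** (n + 1) + (n + 1) - 1 <= length:
        n += 1

    # Truncate to a window holding exactly alphabet_size**n subwords of
    # length n, so sequences of different lengths are comparable.
    window = alphabet_size ** n + n - 1
    prefix = seq[:window]

    # Count distinct length-n subwords in the window.
    subwords = {prefix[i:i + n] for i in range(window - n + 1)}

    # Normalise to [0, 1]: 1 when every possible subword appears.
    return math.log(len(subwords), alphabet_size) / n
```

Under this assumed definition, `topological_entropy("ACGT")` gives 1.0 (every length-1 subword appears) and `topological_entropy("AAAA")` gives 0.0 (only one subword appears). Truncating to a fixed window is what keeps the value from depending on sequence length alone.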

Similar resources

Entropy concepts and DNA investigations

Topological and metric entropies of the DNA sequences from different organisms were calculated. The results were compared with each other and with those of corresponding artificial sequences. For all DNA sequences considered there is a maximum of heterogeneity. It falls in the block length interval [5,7]. Maximum distinction between natural and artificial sequences is shifted on 1-3 position from...

A Generalized Topological Entropy for Analyzing the Complexity of DNA Sequences

Topological entropy is one of the most difficult entropies to be used to analyze the DNA sequences, due to the finite sample and high-dimensionality problems. In order to overcome these problems, a generalized topological entropy is introduced. The relationship between the topological entropy and the generalized topological entropy is compared, which shows the topological entropy is a special c...

Entropy operator for continuous dynamical systems of finite topological entropy

In this paper we introduce the concept of entropy operator for continuous systems of finite topological entropy. It is shown that it generates the Kolmogorov entropy as a special case. If $\phi$ is invertible then the entropy operator is bounded with the topological entropy of $\phi$ as its norm.

Structural Complexity of DNA Sequence

In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-base...

Computing the Topological Entropy of Multimodal Maps via Min-Max Sequences

We derive an algorithm to recursively determine the lap number (minimal number of monotonicity segments) of the iterates of a twice differentiable l-modal map, enabling numerical calculation of the topological entropy of these maps. The algorithm is obtained from the min-max sequences, symbolic sequences that encode qualitative information about all the local extrema of iterated maps.

Journal:
  • Bioinformatics

Volume 27, Issue 8

Pages: -

Publication date: 2011